[Bugfix] LoRA for DeepSeek V3.2 #35077
Conversation
Code Review
This pull request addresses two LoRA regressions related to DeepSeek V3.2/DSA models. The changes primarily involve modifying type checks from `type(obj) is Class` to `isinstance(obj, Class)` to correctly handle subclasses, and introducing a mechanism to unwrap LoRA linear wrappers before accessing quantization metadata. The added test cases validate these fixes, ensuring that LoRA modules are registered correctly and weight post-processing functions can access `quant_method` attributes as expected. The changes are well-targeted and directly resolve the reported issues, improving the robustness of LoRA integration with various model architectures.
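To illustrate the core of the `isinstance()` change, here is a minimal self-contained sketch; the empty class bodies are stand-ins for the real vLLM layers, not their actual implementations:

```python
class MergedColumnParallelLinear:  # stand-in for the real vLLM layer
    pass


class DeepSeekV2FusedQkvAProj(MergedColumnParallelLinear):  # custom subclass
    pass


layer = DeepSeekV2FusedQkvAProj()

# The old exact-type check rejects subclasses, so LoRA never wrapped this layer:
print(type(layer) is MergedColumnParallelLinear)      # False
# The isinstance() check accepts subclasses, so LoRA replacement now proceeds:
print(isinstance(layer, MergedColumnParallelLinear))  # True
```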
Pull request overview
This PR fixes two LoRA regressions encountered with DeepSeek V3.2/DSA that prevented LoRA adapters from being applied to the custom DeepSeekV2FusedQkvAProj layer, which is a subclass of MergedColumnParallelLinear.
Changes:
- Modified LoRA layer replacement logic to support subclasses of `MergedColumnParallelLinear` by changing `type() is` checks to `isinstance()` checks
- Added unwrapping logic in `get_and_maybe_dequant_weights()` to handle LoRA wrappers transparently by accessing the underlying `base_layer` (sketched below)
- Added comprehensive test coverage for both fixes
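A minimal sketch of the unwrapping idea, assuming the wrapper exposes the wrapped module as `base_layer` (as the overview states); the function body and the `dequantize` call are illustrative, not vLLM's actual implementation:

```python
import torch


def get_and_maybe_dequant_weights(layer) -> torch.Tensor:
    # LoRA linear wrappers keep the real module in `base_layer`; unwrap it
    # first so quantization metadata is read from the module that owns it.
    if hasattr(layer, "base_layer"):
        layer = layer.base_layer
    quant_method = getattr(layer, "quant_method", None)
    if quant_method is not None:
        # Hypothetical helper call standing in for the real dequantize path.
        return quant_method.dequantize(layer)
    return layer.weight
```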
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| vllm/model_executor/layers/quantization/utils/quant_utils.py | Adds automatic unwrapping of LoRA wrappers in get_and_maybe_dequant_weights() to access the base layer's quantization metadata |
| vllm/lora/layers/column_parallel_linear.py | Changes type checks from type() is to isinstance() for MergedColumnParallelLinear to support custom subclasses like DeepSeekV2FusedQkvAProj |
| tests/lora/test_layers.py | Adds test cases for subclassed MergedColumnParallelLinear layer replacement and for get_and_maybe_dequant_weights() with LoRA wrappers |
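To make the unwrapping behavior concrete, a hypothetical self-contained test in the spirit of the added coverage (every name here is a fake for illustration; this is not the code in tests/lora/test_layers.py):

```python
import torch


def _unwrap(layer):
    # Same idea as the fix: peel off the LoRA wrapper when present.
    return getattr(layer, "base_layer", layer)


class _FakeBaseLayer:
    def __init__(self) -> None:
        self.weight = torch.zeros(4, 4)


class _FakeLoRAWrapper:
    def __init__(self, base_layer) -> None:
        self.base_layer = base_layer


def test_unwrap_reaches_base_layer_weight():
    base = _FakeBaseLayer()
    assert _unwrap(_FakeLoRAWrapper(base)).weight is base.weight
    assert _unwrap(base).weight is base.weight  # no-op for plain layers
```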
This pull request has merge conflicts that must be resolved before it can be merged.
Hi @HollowMan6, the pre-commit checks have failed. Please run:

```bash
uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files
```

Then, commit the changes and push to your branch.
jeejeelee left a comment:
Overall LGTM except the final comment.
```diff
@@ -202,6 +204,12 @@ def all_gather(self, input_: torch.Tensor, dim: int = -1) -> torch.Tensor:
     + (self.world_size * input_size[dim],)
     + input_size[dim + 1 :]
 )
 # When the gathered dimension has size 1, torch.compile can preserve a
```
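For reference, the quoted hunk builds the all-gathered output shape; a standalone sketch of that shape arithmetic (only the size expression is taken from the hunk, the rest is illustrative):

```python
import torch


def all_gather_output_shape(input_: torch.Tensor, world_size: int, dim: int = -1) -> tuple:
    input_size = tuple(input_.size())
    if dim < 0:
        dim += input_.dim()
    # Shards are concatenated along `dim`, so that dimension grows by a
    # factor of world_size while every other dimension is unchanged.
    return input_size[:dim] + (world_size * input_size[dim],) + input_size[dim + 1:]


# A gathered dimension of size 1 (the torch.compile edge case noted above):
print(all_gather_output_shape(torch.empty(2, 1, 8), world_size=4, dim=1))  # (2, 4, 8)
```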
I think we should move these changes to lora/
Thank you, I just found that this change is actually not necessary after the other fix, so I removed the related changes here.
```diff
 finally:
     # Note: for some reason DeepEP buffers don't seem to be
     # entirely reusable on B200. In order to work around this
     # we clear the all2all manager's cache after each testpoint.
-    cap = current_platform.get_device_capability()
-    if (
-        cap is not None
-        and cap.major == 10
-        and (
-            test_config.backend == "deepep_low_latency"
-            or test_config.backend == "deepep_high_throughput"
-        )
-    ):
+    # DeepEP managers are not reliably reusable across many subtests in
+    # a single worker process. Tear them down after each DeepEP case so
+    # later subtests do not inherit stale communication state.
+    if test_config.backend in {
+        "deepep_low_latency",
+        "deepep_high_throughput",
+    }:
```
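A condensed sketch of the teardown condition from this hunk (the helper name is invented; `device_capability` stands in for the object returned by `current_platform.get_device_capability()`):

```python
DEEPEP_BACKENDS = {"deepep_low_latency", "deepep_high_throughput"}


def should_tear_down_all2all(backend: str, device_capability) -> bool:
    # B200-class GPUs report compute capability major == 10; DeepEP buffers
    # are not reliably reusable there, so managers are torn down after each
    # DeepEP test case to avoid inheriting stale communication state.
    return (
        device_capability is not None
        and device_capability.major == 10
        and backend in DEEPEP_BACKENDS
    )
```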
Hmm, I don't think so, as all the CIs passed before merge, including the specific one you mentioned: https://buildkite.com/vllm/ci/builds/62466/steps/canvas?sid=019db40c-c48c-4238-b1ca-827533eb7d09&tab=output
Also, d22887b was introduced specifically for fixing that CI.
@yewentao256 The error actually appeared before d22887b landed; maybe you forgot to update your local branch? https://buildkite.com/vllm/ci/builds/62124/steps/canvas?jid=019daad3-f858-4b61-b22e-765a9c78a2e4&tab=output#019daad3-f858-4b61-b22e-765a9c78a2e4
Ohh, that makes sense, the nightly build happens at 2am.
Purpose
This PR fixes two LoRA regressions seen with DeepSeek V3.2/DSA:
- LoRA adapters could not be applied to `DeepSeekV2FusedQkvAProj`, a subclass of `MergedColumnParallelLinear`, because the layer replacement logic used exact `type() is` checks.
- `get_and_maybe_dequant_weights()` failed to access `quant_method` when given a LoRA wrapper instead of the underlying `base_layer`.
Test Plan
Added unit test cases, and also ran an end-to-end test manually.
Test Result
All tests pass without the above error.